In recent papers we have indicated the applicability of
the technique of successive approximations to a variety of
nonlinear and multi-dimensional problems arising in the theory
of dynamic programming. Here we indicate how the method, in
the guise of approximation in policy space, can be used to
yield monotone approximation for linear, quadratic and nonlinear
programming. Problems of actual convergence will be
discussed elsewhere.
APPROXIMATION IN POLICY SPACE,
LINEAR AND NONLINEAR PROGRAMMING

Richard  Bellman 


1. Introduction

The  purpose  of  this  paper  is  to  present  some  applications 
of  the  technique  of  successive  approximations,  the  general 
factotum  of  analysis,  to  some  of  the  basic  problems  of  linear, 
quadratic  and  general  nonlinear  programming. 

We  shall  not  discuss  any  of  the  interesting  and  important 
problems  of  convergence  that  arise  in  this  way,  but  will 
content  ourselves  with  the  quite  simple  proof  of  monotone 
approximation. 

The simplicity of proof is a consequence of the fact that
we are using a basic concept of dynamic programming, [1],
"approximation in policy space."

2. Successive Approximation and Linear Programming

Consider the problem of maximizing the linear form

(1)  $\sum_{i=1}^{N} b_i x_i$,

where the $x_i$ are subject to the restrictions

(2)  (a)  $x_i \ge 0$,  $i = 1, 2, \ldots, N$,

     (b)  $\sum_{j=1}^{N} a_{ij} x_j \le c_i$,  $i = 1, 2, \ldots, M$.

We suppose that $a_{ij} \ge 0$, $c_i > 0$, with a sufficient number of
the $a_{ij}$ positive so that the problem is sensible.

Let $x^0 = (x_1^0, x_2^0, \ldots, x_N^0)$ be any set of $x_i$ satisfying
the constraints in (2a) and (2b), and let

(3)  $x = (x_1^0, x_2^0, \ldots, x_{N-2}^0, x_{N-1}, x_N)$.


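We now proceed to maximize the linear expression

(4)  $b_{N-1} x_{N-1} + b_N x_N$

subject to the constraints

(5)  (a)  $x_{N-1}, x_N \ge 0$,

     (b)  $a_{i,N-1} x_{N-1} + a_{iN} x_N \le c_i - \sum_{j=1}^{N-2} a_{ij} x_j^0$,  $i = 1, 2, \ldots, M$.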


Let the values of $x_{N-1}$, $x_N$ determined by this readily resolved
problem be $x_{N-1}^1$, $x_N^1$. Denote the vector
$(x_1^0, x_2^0, \ldots, x_{N-2}^0, x_{N-1}^1, x_N^1)$ by $x^1$ and write

(6)  $x^1 = (x_1^1, x_2^1, \ldots, x_N^1)$.


Now fix the values of $x_2^1, x_3^1, \ldots, x_{N-1}^1$, and consider the
new vector $x$ given by

(7)  $x = (x_1, x_2^1, \ldots, x_{N-1}^1, x_N)$.


Consider the problem of maximizing the linear expression

(8)  $b_1 x_1 + b_N x_N$

subject to the constraints

(9)  (a)  $x_1, x_N \ge 0$,

     (b)  $a_{i1} x_1 + a_{iN} x_N \le c_i - \sum_{j=2}^{N-1} a_{ij} x_j^1$,  $i = 1, 2, \ldots, M$.


This again is simply resolved. Call a set of maximizing values
$x_1^2$, $x_N^2$, and write

(10)  $x^2 = (x_1^2, x_2^2, \ldots, x_{N-1}^2, x_N^2) = (x_1^2, x_2^1, \ldots, x_{N-1}^1, x_N^2)$.

The next step is to fix $x_3^2, x_4^2, \ldots, x_N^2$ and maximize
over $x_1$, $x_2$. Continuing in this way, we obtain a sequence of
successive approximations to the true maximum.
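
The procedure of this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `solve_two_variable_lp` and `pairwise_ascent` are hypothetical, indices are 0-based, and the two-variable problem is resolved by brute-force enumeration of vertices, which is one convenient choice and not a method prescribed in the text. It assumes each two-variable problem has a finite maximum.

```python
from itertools import combinations

def solve_two_variable_lp(bp, bq, ap, aq, r, eps=1e-9):
    """Maximize bp*u + bq*v subject to u, v >= 0 and ap[i]*u + aq[i]*v <= r[i].

    Assumption: the maximum is finite, so it is attained at a vertex of the
    feasible region; we intersect every pair of boundary lines and keep the
    best feasible intersection point.
    """
    # Each boundary line is (alpha, beta, gamma), meaning alpha*u + beta*v = gamma.
    lines = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    lines += [(ap[i], aq[i], r[i]) for i in range(len(r))]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel lines have no single intersection point
        u = (c1 * b2 - c2 * b1) / det
        v = (a1 * c2 - a2 * c1) / det
        feasible = u >= -eps and v >= -eps and all(
            ap[i] * u + aq[i] * v <= r[i] + eps for i in range(len(r)))
        if feasible:
            val = bp * u + bq * v
            if best is None or val > best[0]:
                best = (val, max(u, 0.0), max(v, 0.0))
    return best[1], best[2]

def pairwise_ascent(b, a, c, x0, sweeps=3):
    """Improve a feasible x0 by maximizing over two components at a time."""
    n = len(b)
    x = list(x0)
    values = [sum(bi * xi for bi, xi in zip(b, x))]
    # Cyclic order of pairs: (N-1, N), (N, 1), (1, 2), ... written 0-based.
    pairs = [(n - 2, n - 1), (n - 1, 0)] + [(k, k + 1) for k in range(n - 2)]
    for _ in range(sweeps):
        for p, q in pairs:
            # Right-hand sides left over once the fixed components are accounted for.
            r = [c[i] - sum(a[i][j] * x[j] for j in range(n) if j not in (p, q))
                 for i in range(len(c))]
            x[p], x[q] = solve_two_variable_lp(
                b[p], b[q], [row[p] for row in a], [row[q] for row in a], r)
            values.append(sum(bi * xi for bi, xi in zip(b, x)))
    return x, values

# Illustrative data (made up for this sketch), starting from the feasible point 0.
x, values = pairwise_ascent(
    [3.0, 2.0, 4.0],
    [[1.0, 1.0, 1.0], [2.0, 1.0, 0.0], [0.0, 1.0, 3.0]],
    [10.0, 8.0, 15.0],
    [0.0, 0.0, 0.0])
print(values)  # the recorded values never decrease
```

The list `values` records the linear form after every two-variable step, so the monotone behavior can be inspected directly; nothing in the sketch guarantees that the final value is the true maximum.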


3.  Monotonicity  of  Approximation 

To show that we obtain a sequence of vectors $\{x^k\}$ which
yield a monotone increasing sequence of values for $\sum_{i=1}^{N} b_i x_i$,
we proceed as follows.

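It is clear that having chosen $x^0$, we can always choose
the last two components of $x$ in (2.3) to be $x_{N-1}^0$ and $x_N^0$.
Consequently, when we maximize over $x_{N-1}$ and $x_N$, we automatically
obtain a value of $\sum_{i=1}^{N} b_i x_i$ at least as large as that
given by $x^0$. The same argument applies at each later step, since the
current values of the two free components are always a feasible choice.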
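
Written out for the first step, the argument amounts to the following chain, where the maximum is taken over pairs $(x_{N-1}, x_N)$ satisfying (2.5) and $x^1$ denotes the vector obtained from that maximization. The display is a restatement of the reasoning above and adds no new assumption.

```latex
\sum_{i=1}^{N} b_i x_i^{1}
  = \sum_{i=1}^{N-2} b_i x_i^{0}
    + \max_{x_{N-1},\, x_N} \bigl( b_{N-1} x_{N-1} + b_N x_N \bigr)
  \ge \sum_{i=1}^{N-2} b_i x_i^{0} + b_{N-1} x_{N-1}^{0} + b_N x_N^{0}
  = \sum_{i=1}^{N} b_i x_i^{0}.
```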


4.  Nonlinear  Programming 

It is clear that the same technique can be applied to the
problem of maximizing $\sum_{i=1}^{N} g_i(x_i)$ subject to a series of
inequalities of the form

(1)  (a)  $x_i \ge 0$,

     (b)  $\sum_{j=1}^{N} h_{ij}(x_j) \le c_i$,  $i = 1, 2, \ldots, M$.

A difference is that the problem of maximizing over two variables
in general will require a computational solution and will not
possess a simple explicit solution as in the linear case.
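
As a rough illustration of what such a computational solution might look like, the sketch below maximizes over two variables by a coarse grid search. Everything here is an assumption made for illustration: the name `maximize_over_pair`, the grid resolution, the bounds `u_max` and `v_max`, and the sample objective and constraint. The current pair of values is always kept as a candidate, so the step can never decrease the objective.

```python
def maximize_over_pair(objective, feasible, start, u_max, v_max, steps=100):
    """Grid search for the best feasible (u, v), never worse than `start`.

    `objective(u, v)` and `feasible(u, v)` are caller-supplied functions with
    all other variables held fixed; `start` is the current feasible pair.
    """
    best_uv = start
    best_val = objective(*start)
    for i in range(steps + 1):
        u = u_max * i / steps
        for j in range(steps + 1):
            v = v_max * j / steps
            if feasible(u, v):
                val = objective(u, v)
                if val > best_val:
                    best_uv, best_val = (u, v), val
    return best_uv, best_val

# Hypothetical example: maximize sqrt(u) + 2*sqrt(v) subject to u^2 + v^2 <= 1.
objective = lambda u, v: u ** 0.5 + 2.0 * v ** 0.5
feasible = lambda u, v: u * u + v * v <= 1.0
best_uv, best_val = maximize_over_pair(objective, feasible, (0.5, 0.5), 1.0, 1.0)
print(best_uv, best_val)
```

A grid search is the crudest possible choice; any one- or two-dimensional search routine could take its place, provided the current point remains among the candidates.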

5.  Convergence 

It is not at all clear, even in the linear case where
there is a unique maximum, rather than a set of local maxima,
as may occur in the nonlinear case, that the sequence of values
converges to the actual maximum. That the sequence converges is
clear, since it is monotone and bounded, but it is not clear to
what it converges.

Consequently, if the sequence of values sticks at a particular
value, the thing to do is to upset the cyclic arrangement
described above and to study other sets of two values at a time.
Thus, instead of (1,2), (2,3), ..., (N-1,N), (N,1), we can use
(1,4), (4,7), ..., and so on.

Furthermore, instead of using a fixed sequence of pairs,
which increases the probability of pathological behavior, we
can use a random selection of pairs.

It would be interesting to know, if the simple technique
described above does not yield convergence, whether this is the
usual situation, or whether it occurs with small probability.
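
The alternative orderings of pairs mentioned in this section can be generated as follows. This is a small sketch with hypothetical names (`cyclic_pairs`, `random_pairs`); indices are 1-based to match the text, and the `stride` parameter is an assumption introduced to cover both the (1,2), (2,3), ... and the (1,4), (4,7), ... arrangements.

```python
import random
from itertools import islice

def cyclic_pairs(n, stride=1):
    """Yield (i, i+stride), (i+stride, i+2*stride), ... with indices wrapped into 1..n."""
    i = 1
    while True:
        j = (i - 1 + stride) % n + 1
        yield (i, j)
        i = j

def random_pairs(n, rng=None):
    """Yield pairs of two distinct indices from 1..n, chosen at random."""
    rng = rng or random.Random()
    while True:
        yield tuple(rng.sample(range(1, n + 1), 2))

print(list(islice(cyclic_pairs(5), 5)))      # (1,2), (2,3), (3,4), (4,5), (5,1)
print(list(islice(cyclic_pairs(8, 3), 3)))   # (1,4), (4,7), (7,2)
print(list(islice(random_pairs(5, random.Random(0)), 3)))
```

If `stride` and `n` share a common factor, the strided cycle visits only some of the indices, so in practice one would pick a stride prime to `n` or mix several schedules.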

6. Quadratic Programming

As another application of successive approximations, let us
consider the problem of maximizing (x,Ax), where A is a
positive definite matrix, subject to the constraints of (2.2).
Write

(1)  $(x, Ax) = (y + x - y, A(y + x - y))$

     $= (y, Ay) + 2(x - y, Ay) + (x - y, A(x - y))$.


It  follows  that 

(2)  $(x, Ax) \ge (y, Ay) + 2(x - y, Ay)$
for  all  y  and  x. 

Consider  then,  in  place  of  the  original  nonlinear  problem, 
the  problem  of  maximizing 

(3)  $J(x, x^0) = (x^0, Ax^0) + 2(x - x^0, Ax^0)$

over all x subject to the constraints of (2.2), where $x^0$ is
an initial guess satisfying (2.2).

Since $x = x^0$ is a feasible choice, it follows that any
x which yields the maximum of (3) furnishes a value, $x^1$,
which yields a value of (x,Ax) at least as large as that given by $x^0$. For

(4)  $(x^1, Ax^1) \ge (x^0, Ax^0) + 2(x^1 - x^0, Ax^0) \ge (x^0, Ax^0)$,

by  virtue  of  the  maximum  property. 

The  original  nonlinear  problem  has  thus  been  reduced  to  a 
sequence  of  linear  problems.  We  shall  discuss  the  convergence 
question  elsewhere. 
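
To make the reduction concrete, here is a minimal sketch for the special case of a single constraint, $x_i \ge 0$ and $\sum_j a_j x_j \le c$ with every $a_j > 0$. That restriction is an assumption made only so that each linear problem can be solved by inspecting the vertices of the feasible region; the names `quad_form`, `linearized_step` and `successive_linearization` are hypothetical, and A is taken to be symmetric as well as positive definite.

```python
def quad_form(A, x):
    """Return (x, Ax)."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def linearized_step(A, a, c, x0):
    """Maximize J(x, x0) = (x0, Ax0) + 2(x - x0, Ax0) over x >= 0, sum_j a[j]*x[j] <= c.

    Only the term (x, Ax0) depends on x. Under the single-constraint assumption
    its maximum sits at a vertex: the origin or one of the points (c / a[j]) * e_j.
    """
    n = len(x0)
    g = [sum(A[i][j] * x0[j] for j in range(n)) for i in range(n)]  # g = A x0
    best_j = max(range(n), key=lambda j: g[j] * c / a[j])
    x1 = [0.0] * n
    if g[best_j] > 0:
        x1[best_j] = c / a[best_j]
    return x1

def successive_linearization(A, a, c, x0, steps=5):
    """Repeat the linearized step and record (x, Ax) after each one."""
    x = list(x0)
    values = [quad_form(A, x)]
    for _ in range(steps):
        x = linearized_step(A, a, c, x)
        values.append(quad_form(A, x))
    return x, values

# Illustrative data (made up for this sketch): A symmetric positive definite.
A = [[2.0, 1.0], [1.0, 3.0]]
x, values = successive_linearization(A, [1.0, 1.0], 4.0, [1.0, 1.0])
print(x, values)
```

With several constraints the same loop applies, with `linearized_step` replaced by any linear programming routine, including the two-variable scheme of Section 2.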